PatchSizePilot

Biomass

Aim

Here I study how biomass density changes across treatments in the PatchSizePilot. In particular, I’m studying how biomass density changes across:

  • ecosystems of different size (this is the case in nature, but we need to make sure that it’s the case also in our experiment)
  • ecosystems that are connected (through the flow of nutrients) to an ecosystem of the same size or to an ecosystem that is larger
  • meta-ecosystems in which patch size is the same or in which one patch has most of the area (we keep the total area of the meta-ecosystem constant)

Data manipulation

Import

culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
load(here("data", "population", "t0.RData")); t0 = pop_output
load(here("data", "population", "t1.RData")); t1 = pop_output
load(here("data", "population", "t2.RData")); t2 = pop_output
load(here("data", "population", "t3.RData")); t3 = pop_output
load(here("data", "population", "t4.RData")); t4 = pop_output
load(here("data", "population", "t5.RData")); t5 = pop_output
load(here("data", "population", "t6.RData")); t6 = pop_output
load(here("data", "population", "t7.RData")); t7 = pop_output
rm(pop_output)

Tidy

#Column: time
t0$time = NA
t1$time = NA

#Column: replicate_video
t0$replicate_video = 1:12 #In t1 I took 12 videos of a single 
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>%
  rename(replicate_video = replicate)
t7 = t7 %>%
  rename(replicate_video = replicate)

#Create an elongated version of t0 so that each of the 110 cultures can have 12 video replicates at t0.
elongating_t0 = NULL
for (video in 1:nrow(t0)){
  
  for (ID in 1:nrow(culture_info)) {
    
    elongating_t0 = rbind(elongating_t0, t0[video,])
    
    }

  }

ID_vector = rep(1:nrow(culture_info), 
                times = nrow(t0))

elongating_t0$culture_ID = ID_vector

#Merge previous data-sets
t0 = merge(culture_info,elongating_t0, by="culture_ID")
t1 = merge(culture_info,t1, by = "culture_ID")
t2 = merge(culture_info,t2, by = "culture_ID")
t3 = merge(culture_info,t3, by = "culture_ID")
t4 = merge(culture_info,t4, by = "culture_ID")
t5 = merge(culture_info,t5, by = "culture_ID")
t6 = merge(culture_info,t6, by = "culture_ID")
t7 = merge(culture_info,t7, by = "culture_ID")
ds_biomass = rbind(t0, t1, t2, t3, t4, t5, t6, t7)
rm(elongating_t0, t0, t1, t2, t3, t4, t5, t6, t7)

ds_biomass$time_point[ds_biomass$time_point=="t0"] = 0
ds_biomass$time_point[ds_biomass$time_point=="t1"] = 1
ds_biomass$time_point[ds_biomass$time_point=="t2"] = 2
ds_biomass$time_point[ds_biomass$time_point=="t3"] = 3
ds_biomass$time_point[ds_biomass$time_point=="t4"] = 4
ds_biomass$time_point[ds_biomass$time_point=="t5"] = 5
ds_biomass$time_point[ds_biomass$time_point=="t6"] = 6
ds_biomass$time_point[ds_biomass$time_point=="t7"] = 7
ds_biomass$time_point = as.character(ds_biomass$time_point)

ds_biomass$day = NA
ds_biomass$day[ds_biomass$time_point== 0] = 0
ds_biomass$day[ds_biomass$time_point== 1] = 4
ds_biomass$day[ds_biomass$time_point== 2] = 8
ds_biomass$day[ds_biomass$time_point== 3] = 12
ds_biomass$day[ds_biomass$time_point== 4] = 16
ds_biomass$day[ds_biomass$time_point== 5] = 20
ds_biomass$day[ds_biomass$time_point== 6] = 24
ds_biomass$day[ds_biomass$time_point== 7] = 28

#Column: eco_metaeco_type
ds_biomass$eco_metaeco_type = factor(ds_biomass$eco_metaeco_type, 
                             levels = c('S', 
                                        'S (S_S)', 
                                        'S (S_L)', 
                                        'M', 
                                        'M (M_M)', 
                                        'L', 
                                        'L (L_L)', 
                                        'L (S_L)'))

ecosystems_to_take_off = 60 #Culture number 60 because it was spilled
ds_biomass = ds_biomass %>%
  filter(! culture_ID %in% ecosystems_to_take_off)

ds_for_evaporation = ds_biomass

ds_biomass = ds_biomass %>% 
  select(culture_ID, 
         patch_size, 
         disturbance, 
         metaecosystem_type, 
         bioarea_per_volume, 
         replicate_video, 
         time_point,
         day,
         metaecosystem, 
         system_nr, 
         eco_metaeco_type)

ds_biomass = ds_biomass[, c("culture_ID", 
            "system_nr", 
            "disturbance", 
            "time_point",
            "day", 
            "patch_size", 
            "metaecosystem", 
            "metaecosystem_type", 
            "eco_metaeco_type", 
            "replicate_video",
            "bioarea_per_volume")]

Create regional data set

ds_regional = ds_biomass %>%
  filter(metaecosystem == "yes") %>%
  group_by(culture_ID, 
           system_nr, 
           disturbance, 
           day, 
           time_point, 
           patch_size, 
           metaecosystem_type) %>%
  summarise(patch_mean_bioarea_across_videos = mean(bioarea_per_volume)) %>%
  group_by(system_nr, disturbance, day, time_point, metaecosystem_type) %>%
  summarise(regional_mean_bioarea = mean(patch_mean_bioarea_across_videos))

metaecosystems_to_take_off = 40 #System 40 was the system of culture 60 that I spilled
ds_regional = ds_regional %>%
  filter(! system_nr %in% metaecosystems_to_take_off)

Regional biomass

Medium-Medium vs Small-Large

We want to understand if the regional biomass produced by an ecosystem with a small and a large patch (metaecosystem_type = S_L) is lower than the regional biomass produced by an ecosystem with two medium patches (metaecosystem_type = M_M).

Plots

Let’s start by plotting the single ecosystems to see that everything is fine. To make the patterns clear let’s plot the low disturbance and high disturbance in two different figures. We first plot the single meta-ecosystems and then their box plots.

ds_regional %>%
    filter ( disturbance == "low") %>%
    filter (metaecosystem_type == "S_L" | 
              metaecosystem_type == "M_M") %>%
    ggplot (aes(x = day,
                y = regional_mean_bioarea,
                group = system_nr,
                fill = system_nr,
                color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (something/µl)",
         title = "Disturbance = low",
         fill = "System nr",
         color = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_linetype_discrete(labels = c("medium-medium",
                                     "small-large")) + 
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional %>%
    filter ( disturbance == "high") %>%
    filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
    ggplot (aes(x = day,
                y = regional_mean_bioarea,
                group = system_nr,
                fill = system_nr,
                color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (something/µl)",
         title = "Disturbance = high",
         fill = "System nr",
         color = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_linetype_discrete(labels = c("medium-medium",
                                     "small-large")) + 
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

p_regional_low_mean = ds_regional %>%
  filter(disturbance == "low") %>%
  filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
  ggplot (aes(x = day,
              y = regional_mean_bioarea,
              group = interaction(day, metaecosystem_type),
              fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(x = "Day", 
       y = "Regional bioarea (something/µl)",
       title = "Disturbance = low",
       color='', 
       fill='') +
  scale_fill_discrete(labels = c("medium-medium", 
                                 "small-large")) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6))  +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
p_regional_low_mean

ds_regional %>%
  filter(disturbance == "high") %>%
  filter (metaecosystem_type == "S_L" | metaecosystem_type == "M_M") %>%
  ggplot (aes(x = day,
              y = regional_mean_bioarea,
              group = interaction (day, metaecosystem_type),
              fill = metaecosystem_type)) +
  geom_boxplot() +
  labs(x = "Day", 
       y = "Regional bioarea (something/µl)",
       title = "Disturbance = high",
       color='', 
       fill='') +
  scale_fill_discrete(labels = c("medium-medium", "small-large")) +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

We can see that the regional biomass was higher for meta-ecosystems with the same size, regardless of disturbance. As both disturbance levels showed the same pattern, I would just keep the one with low disturbance in the publication, as it shows a stronger pattern.

Model time series
Time as random effect

Tidy

First of all, let’s modify the data set including the regional biomass of our meta-ecosystems. In this data set, we want to have the regional biomass of the meta-ecosystems (averaged first across videos and then across patches) in which we:

  • Include only the meta-ecosystems in which patches had both medium size (metaecosystem_type = M_M) and meta-ecosystems in which patches had one a small size and the other large size (metaecosystem_type = S_L).

  • Take off the first two point (day 0 and day = 4). This is because the first perturbation happened only at day 5.

ds_regional_MM_SL_t2t7 = ds_regional %>%
    filter (metaecosystem_type == "M_M" | metaecosystem_type == "S_L", 
            time_point >= 2)

Model selection

Let’s start from the largest mixed effect model.

full_model = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             metaecosystem_type : disturbance + 
                             (metaecosystem_type | day) + 
                             (disturbance | day) + 
                             (metaecosystem_type : disturbance  | day) +
                             (1 | system_nr) , 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

Should we keep M * D?

no_MD = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             (metaecosystem_type | day) + 
                             (disturbance | day) + 
                             (1 | system_nr) , 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

anova(full_model, no_MD)
## Data: ds_regional_MM_SL_t2t7
## Models:
## no_MD: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + (disturbance | day) + (1 | system_nr)
## full_model: regional_mean_bioarea ~ metaecosystem_type + disturbance + metaecosystem_type:disturbance + (metaecosystem_type | day) + (disturbance | day) + (metaecosystem_type:disturbance | day) + (1 | system_nr)
##            npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## no_MD        11 1861.7 1892.4 -919.86   1839.7                     
## full_model   27 1891.1 1966.4 -918.56   1837.1 2.5891 16     0.9999

No.

Should we keep the random slope of (M | day)?

no_M_day_slope = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             (disturbance | day) + 
                             (1 | system_nr) , 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

anova(no_MD, no_M_day_slope)
## Data: ds_regional_MM_SL_t2t7
## Models:
## no_M_day_slope: regional_mean_bioarea ~ metaecosystem_type + disturbance + (disturbance | day) + (1 | system_nr)
## no_MD: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + (disturbance | day) + (1 | system_nr)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## no_M_day_slope    8 1871.0 1893.2 -927.48   1855.0                        
## no_MD            11 1861.7 1892.4 -919.86   1839.7 15.239  3   0.001623 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes.

Should we keep the correlation between intercept and slope of (M | day)?

no_M_day_correlation = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             (metaecosystem_type || day) + 
                             (disturbance | day) + 
                             (1 | system_nr) , 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

anova(no_MD, no_M_day_correlation)
## Data: ds_regional_MM_SL_t2t7
## Models:
## no_MD: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + (disturbance | day) + (1 | system_nr)
## no_M_day_correlation: regional_mean_bioarea ~ metaecosystem_type + disturbance + ((1 | day) + (0 + metaecosystem_type | day)) + (disturbance | day) + (1 | system_nr)
##                      npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## no_MD                  11 1861.7 1892.4 -919.86   1839.7                    
## no_M_day_correlation   12 1863.7 1897.2 -919.86   1839.7     0  1     0.9996

Yes.

Should we keep the random slope of (D| day)?

no_D_day_slope = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             (metaecosystem_type | day) + 
                             (1 | system_nr) , 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

anova(no_MD, no_M_day_slope)
## Data: ds_regional_MM_SL_t2t7
## Models:
## no_M_day_slope: regional_mean_bioarea ~ metaecosystem_type + disturbance + (disturbance | day) + (1 | system_nr)
## no_MD: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + (disturbance | day) + (1 | system_nr)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## no_M_day_slope    8 1871.0 1893.2 -927.48   1855.0                        
## no_MD            11 1861.7 1892.4 -919.86   1839.7 15.239  3   0.001623 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes.

Should we keep the correlation between intercept and slope of (D | day)?

no_D_day_correlation = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             (metaecosystem_type | day) + 
                             (disturbance || day) + 
                             (1 | system_nr) , 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

anova(no_MD, no_D_day_correlation)
## Data: ds_regional_MM_SL_t2t7
## Models:
## no_MD: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + (disturbance | day) + (1 | system_nr)
## no_D_day_correlation: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + ((1 | day) + (0 + disturbance | day)) + (1 | system_nr)
##                      npar    AIC    BIC  logLik deviance Chisq Df Pr(>Chisq)
## no_MD                  11 1861.7 1892.4 -919.86   1839.7                    
## no_D_day_correlation   12 1863.7 1897.2 -919.86   1839.7     0  1     0.9996

No.

Should we keep (1 | system_nr)?

no_system_nr = lmer(regional_mean_bioarea ~ 
                             metaecosystem_type  + 
                             disturbance + 
                             (metaecosystem_type | day) + 
                             (disturbance || day), 
                             data = ds_regional_MM_SL_t2t7, 
                             REML = FALSE)

anova(no_D_day_correlation, no_system_nr)
## Data: ds_regional_MM_SL_t2t7
## Models:
## no_system_nr: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + ((1 | day) + (0 + disturbance | day))
## no_D_day_correlation: regional_mean_bioarea ~ metaecosystem_type + disturbance + (metaecosystem_type | day) + ((1 | day) + (0 + disturbance | day)) + (1 | system_nr)
##                      npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## no_system_nr           11 1869.2 1899.9 -923.60   1847.2                     
## no_D_day_correlation   12 1863.7 1897.2 -919.86   1839.7 7.4907  1   0.006202
##                        
## no_system_nr           
## no_D_day_correlation **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes.

Best model

Therefore, the best model is

\[ Regional \: bioarea = M + D + (M | t) + (D || t) \]

This model is the best model when looking at all time points coming after the first disturbance event (t2->t7). Assuming that this model holds for also other sections of the time series, the r squared of the model and of meta-ecosystem type is as follows.

#Create a table with all the models in which time is a random effect. 

### --- INITIALISE TABLE --- ###

columns = c("model", "time_point", "AIC", "BIC", "R2_mixed", "R2_fixed", "R2_mixed_M", "R2_fixed_M")
random_time_table = data.frame(matrix(ncol = length(columns),
                                     nrow = 0))
colnames(random_time_table) = columns

### --- M + D + (M | t) + (D || t) --- ###

for (last_point in 3:7) {
  
  full_model = lmer(regional_mean_bioarea ~ 
                      disturbance +
                      metaecosystem_type +
                      (metaecosystem_type | day) +
                      (1 | system_nr),
                    data = filter(ds_regional_MM_SL_t2t7, 
                                  time_point <= last_point),
                    REML = FALSE,
                    control = lmerControl(optimizer ="Nelder_Mead"))
  
  null_model = lm(regional_mean_bioarea ~ 
                    1 , 
                  data = filter(ds_regional_MM_SL_t2t7, time_point <= last_point))
   
  metaeco_null = lmer(regional_mean_bioarea ~  
                        disturbance  + 
                        (1 | day) +
                        (1 | system_nr),
                      data = filter(ds_regional_MM_SL_t2t7, 
                                    time_point <= last_point),
                      REML = FALSE, 
                      control = lmerControl(optimizer ="Nelder_Mead"))
  
  random_time_table = update_all_models_table("M+D+(M|t)+(D||t)",
                                             random_time_table, 
                                             full_model, 
                                             null_model,
                                             metaeco_null,
                                             "mixed")
}

datatable(random_time_table, 
          rownames = FALSE,
          options = list(pageLength = 100,
                         scrollX = TRUE,
                         autoWidth = TRUE,
                         columnDefs = list(list(targets=c(0),visible=TRUE, width='160'),
                                           list(targets=c(1), visible=TRUE, width='10'),
                                           list(targets=c(2), visible=TRUE, width='10'),
                                           list(targets=c(3), visible=TRUE, width='10'),
                                           list(targets=c(4), visible=TRUE, width='10'),
                                           list(targets=c(5), visible=TRUE, width='10'),
                                           list(targets=c(6), visible=TRUE, width='10'),
                                           list(targets=c(7), visible=TRUE, width='10'),
                                           list(targets='_all', visible=FALSE))),
          caption = "
          M = Meta-ecosystem type, 
          D = disturbance, 
          t = time,
          (M | t) = random effect of time on the intercept and slope of M,
          (D || t) = random effect of time on the intercept and slope of D, 
          || = no correlation between intercept and slope,
          | = correlation between intercept and slope,
          R2_mixed = r squared of the model,
          R2_fixed = r squared of the model when considering only fixed effects,
          R2_mixed_M = r squared of meta-ecosystem type,
          R2_fixed_M = r squared of meta-ecosystem type when considerin only fixed effects")

How is it possible that the mixed effect of M is higher than its fixed effects?

Time as fixed effect

Here we want to see whether we can include time as a fixed effect by transforming the bioarea into its logarithm with base 10. As we are not considering the first two time points because they were before the first disturbance (we want to see the effects of meta-ecosystem type under disturbance), we are lucky that there is a better chance that the biomass will look like goes down in a linear fashion.

Regional bioarea ~ time

ds_regional %>%
  filter(time_point >= 2) %>%
  ggplot(aes(x = day,
             y = regional_mean_bioarea,
             group = day)) +
  geom_boxplot() +
  labs(title = "Without log transformation",
       x = "Day",
       y = "Regional bioarea (something/µl)")

Let’s check how linear the relationship is.

linear_model = lm(regional_mean_bioarea ~ 
                    day, 
                  data = ds_regional %>% 
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

par(mfrow=c(2,3))
plot(linear_model, which = 1:5)

I’m not 100% convinced, as the residuals vs fitted has a bit of a banana shape line.

Log Regional bioarea ~ time

ds_regional %>%
  filter(time_point >= 2) %>%
  ggplot(aes(x = day,
             y = log(regional_mean_bioarea + 1),
             group = day)) +
  geom_boxplot() +
  labs(title = "With log transformation",
       x = "Day",
       y = "Log (regional bioarea + 1) (something/µl)")

Let’s now check how linear these two relationships are.

log_linear_model = lm(log10(regional_mean_bioarea + 1) ~ 
                    day, 
                  data = ds_regional %>% 
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"))

par(mfrow=c(2,3))
plot(log_linear_model, which = 1:5)
par(mfrow=c(1,1))

Way better. Especially the residuals vs fitted values plot doesn’t have a banana shape anymore. This seems to be a good model. Let’s then keep the log transformed bioarea.

Model selection

Let’ start from the full model.

full = lmer(log10(regional_mean_bioarea + 1) ~
                     day * metaecosystem_type * disturbance +
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

Should we keep the correlation in (day | system_nr)?

no_correlation = lmer(log10(regional_mean_bioarea + 1) ~
                     day * metaecosystem_type * disturbance +
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(full, no_correlation)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## full: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
## no_correlation: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
##                npar     AIC     BIC logLik deviance Chisq Df Pr(>Chisq)
## full             12 -168.78 -135.33 96.389  -192.78                    
## no_correlation   12 -168.78 -135.33 96.389  -192.78     0  0

Yes.

Should we keep t * M * D?

no_threeway = lmer(log10(regional_mean_bioarea + 1) ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : metaecosystem_type + 
                     day : disturbance +
                     metaecosystem_type : disturbance + 
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = 'optimx', 
                                         optCtrl = list(method = 'L-BFGS-B')))

anova(full, no_threeway)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_threeway: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## full: log10(regional_mean_bioarea + 1) ~ day * metaecosystem_type * disturbance + (day | system_nr)
##             npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)
## no_threeway   11 -170.66 -140.00 96.330  -192.66                     
## full          12 -168.78 -135.33 96.389  -192.78 0.1182  1     0.7309

No.

Should we keep t * M?

no_TM = lmer(log10(regional_mean_bioarea + 1) ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : disturbance +
                     metaecosystem_type : disturbance + 
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_threeway,no_TM)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_TM: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
## no_threeway: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:metaecosystem_type + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
##             npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)
## no_TM         10 -172.62 -144.74 96.308  -192.62                     
## no_threeway   11 -170.66 -140.00 96.330  -192.66 0.0431  1     0.8356

No.

Should we keep t * D?

no_TD = lmer(log10(regional_mean_bioarea + 1) ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     metaecosystem_type : disturbance + 
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))
anova(no_TM, no_TD)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_TD: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + metaecosystem_type:disturbance + (day | system_nr)
## no_TM: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
##       npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)    
## no_TD    9 -163.54 -138.46 90.771  -181.54                         
## no_TM   10 -172.62 -144.74 96.308  -192.62 11.074  1  0.0008756 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

No.

Should we keep M * D?

no_MD = lmer(log10(regional_mean_bioarea + 1) ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : disturbance +
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_TM, no_MD)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_MD: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + (day | system_nr)
## no_TM: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + metaecosystem_type:disturbance + (day | system_nr)
##       npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)
## no_MD    9 -173.87 -148.78 95.932  -191.87                     
## no_TM   10 -172.62 -144.74 96.308  -192.62 0.7513  1     0.3861

No.

Should we keep the random effect of system nr on the time slopes (day | system_nr)?

no_random_slopes = lmer(log10(regional_mean_bioarea + 1) ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : disturbance +
                     (1 | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L"),
                   REML = FALSE,
                   control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_MD, no_random_slopes)
## Data: ds_regional %>% filter(time_point >= 2) %>% filter(metaecosystem_type ==  ...
## Models:
## no_random_slopes: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + (1 | system_nr)
## no_MD: log10(regional_mean_bioarea + 1) ~ day + metaecosystem_type + disturbance + day:disturbance + (day | system_nr)
##                  npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)  
## no_random_slopes    7 -173.23 -153.71 93.613  -187.23                       
## no_MD               9 -173.87 -148.78 95.932  -191.87 4.6394  2     0.0983 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes.

Best model

Therefore, our best model is:

\[ log_{10}(regional \: bioarea + 1) = t + M + D + t*D + (t| system \: nr) \]

This model is the best model when looking at all time points coming after the first disturbance event (t2->t7). Assuming that this model holds for also other sections of the time series, the r squared of the model and of meta-ecosystem type is as follows.

#Create a table in which the regional biomass has been log transformed. 

### --- INITIALISE TABLE --- ###

columns = c("model", "time_point", "AIC", "BIC", "R2_mixed", "R2_fixed", "R2_mixed_M", "R2_fixed_M")
log_time_table = data.frame(matrix(ncol = length(columns), nrow = 0))
colnames(log_time_table) = columns

### --- POPULATE THE TABLE --- ###

for (last_point in 4:7) {
  
  full_model = lmer(log10(regional_mean_bioarea + 1) ~
                     day +
                     metaecosystem_type +
                     disturbance +
                     day : disturbance +
                     (day | system_nr),
                     data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(time_point <= last_point) %>%
                            filter(metaecosystem_type == "M_M" | 
                                   metaecosystem_type == "S_L"),
                    REML = FALSE,
                    control = lmerControl(optimizer = "Nelder_Mead"))

  
  null_model = lm(regional_mean_bioarea ~ 
                    1 , 
                  data = ds_regional %>%
                            filter(time_point >= 2) %>%
                            filter(time_point <= last_point) %>%
                            filter(metaecosystem_type == "M_M" | 
                                   metaecosystem_type == "S_L"))
  
  metaeco_null_model = lmer(log10(regional_mean_bioarea + 1) ~
                              day +
                              disturbance +
                              day : disturbance +
                              (day | system_nr),
                            data = ds_regional %>%
                              filter(time_point >= 2) %>%
                              filter(time_point <= last_point) %>%
                              filter(metaecosystem_type == "M_M" | 
                                     metaecosystem_type == "S_L"),
                            REML = FALSE,
                            control = lmerControl(optimizer = "Nelder_Mead"))
  
  log_time_table = update_all_models_table("t + M + D + t * M * D + (t || system_nr)",
                                             log_time_table, 
                                             full_model, 
                                             null_model,
                                             metaeco_null_model,
                                             "mixed")
}

datatable(log_time_table, 
          rownames = FALSE,
          options = list(pageLength = 100,
                         scrollX = TRUE,
                         autoWidth = TRUE,
                         columnDefs = list(list(targets=c(0),visible=TRUE, width='160'),
                                           list(targets=c(1), visible=TRUE, width='10'),
                                           list(targets=c(2), visible=TRUE, width='10'),
                                           list(targets=c(3), visible=TRUE, width='10'),
                                           list(targets=c(4), visible=TRUE, width='10'),
                                           list(targets=c(5), visible=TRUE, width='10'),
                                           list(targets=c(6), visible=TRUE, width='10'),
                                           list(targets=c(7), visible=TRUE, width='10'),
                                           list(targets='_all', visible=FALSE))),
          caption = "
          M = Meta-ecosystem type, 
          D = disturbance, 
          (1 | t) = random effect of time on the intercept,
          (1 | ID) = random effect of meta-ecosystem ID on the intercept, 
          || = no correlation between intercept and slope,
          | = correlation between intercept and slope,
          R2 = r squared of the whole model,
          R2_fixed = fixed part of the mixed model,
          mixed_R2 = r squared when considering both fixed and random effects (conditional r squared), 
          fixed_R2 = r squared when considering only the fixed effects (marginal r squared)")

Notice that I did not include the t3-t4 time series, as when running the model it gives me the following error:

  • Error: number of observations (=40) <= number of random effects (=40) for term (day | system_nr); the random-effects parameters and the residual variance (or scale parameter) are probably unidentifiable
Fitted time function

Here we want to fit how biomass changes across time to a function. The biomass of our meta-ecosystems looks like this.

ds_regional %>%
  filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L") %>%
  ggplot(aes(x = day,
             y = regional_mean_bioarea,
             group = day)) + 
  geom_boxplot() +
  labs(x = "day", y = "Regional bioarea (something/microlitres)")  +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

To fit these data, we need to produce (and parameterise) a function that resemble how the biomass first increases and then decreases.

Hank produced the following function:

\[biomass = a_4 * (day - a_5) * e^{a_1(day - a_5)}\]

If we parameterise the function and then plot, it looks as follows.

a1 = -0.1
a4 = 1200
a5 = -1

day = seq(0, 30, 0.01)
biomass = a4*(day-a5) * exp(a1*(day-a5))
plot(biomass ~ day)

Now, let’s find the best parameters (a1, a4, a5) that fit our data.

ds_regional_shrunk_type = ds_regional %>%
    filter(metaecosystem_type == "M_M" | metaecosystem_type == "S_L")

model = nls(regional_mean_bioarea ~ a4 * (day-a5) * exp(a1 * (day-a5)), 
            start = list(a1 = -0.1, a4 = 1200, a5 = -1),
            trace = T,
            data = ds_regional_shrunk_type)

a1 = as.numeric(model$m$getPars()[1])
a4 = as.numeric(model$m$getPars()[2])
a5 = as.numeric(model$m$getPars()[3])
model$m$getPars()
##           a1           a4           a5 
##   -0.1336182 1216.4268967   -1.6147345

And now let’s plot the function to see how it fits our data.

day = seq(0,30,1)
predicted = a4*(day-a5)*exp(a1*(day-a5))
data_fitted=data.frame(day=day,regional_mean_bioarea=predicted)

ds_regional_shrunk_type%>%
  ggplot(aes(x = day,
             y = regional_mean_bioarea,
             group = day)) + 
  geom_boxplot() +
  labs(x = "Day", y = "Regional bioarea") +
  geom_line(data=data_fitted,aes(x = day, y=regional_mean_bioarea),color="red", group = 1) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Let’s then include our predictions into the data set.

ds_regional_predicted_shrunk_type = ds_regional %>%
  mutate(predicted_from_time = a4*(day-a5)*exp(a1*(day-a5))) %>%
  filter(metaecosystem_type == "S_L" | metaecosystem_type == "M_M")

Let’s now work using the fitted data that we found in the previous section.

Tidy

ds_regional_predicted_shrunk_type_n_day = ds_regional_predicted_shrunk_type %>%
  filter(time_point >= 2)

Model selection
Let’s start from the largest mixed effect model.

full = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time * metaecosystem_type * disturbance +
              (predicted_from_time | system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))
## Warning: Some predictor variables are on very different scales: consider
## rescaling

Should we keep the correlation in (day | system_nr)?

no_correlation = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time * metaecosystem_type * disturbance +
              (predicted_from_time || system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))
## Warning: Some predictor variables are on very different scales: consider
## rescaling
anova(full, no_correlation)
## Data: ds_regional_predicted_shrunk_type_n_day
## Models:
## no_correlation: log10(regional_mean_bioarea + 1) ~ predicted_from_time * metaecosystem_type * disturbance + ((1 | system_nr) + (0 + predicted_from_time | system_nr))
## full: log10(regional_mean_bioarea + 1) ~ predicted_from_time * metaecosystem_type * disturbance + (predicted_from_time | system_nr)
##                npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)  
## no_correlation   11 -158.51 -127.85 90.257  -180.51                       
## full             12 -161.16 -127.71 92.581  -185.16 4.6486  1    0.03108 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Keep full

Yes.

Should we keep t * M * D?

no_TMD = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time +
              metaecosystem_type +
              disturbance +
              predicted_from_time : metaecosystem_type +
              predicted_from_time : disturbance +
              metaecosystem_type : disturbance +
              (predicted_from_time | system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))
## Warning: Some predictor variables are on very different scales: consider
## rescaling
anova(full, no_TMD)
## Data: ds_regional_predicted_shrunk_type_n_day
## Models:
## no_TMD: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:metaecosystem_type + predicted_from_time:disturbance + metaecosystem_type:disturbance + (predicted_from_time | system_nr)
## full: log10(regional_mean_bioarea + 1) ~ predicted_from_time * metaecosystem_type * disturbance + (predicted_from_time | system_nr)
##        npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)
## no_TMD   11 -163.07 -132.41 92.538  -185.07                     
## full     12 -161.16 -127.71 92.581  -185.16 0.0874  1     0.7675
#Keep no_TMD

No.

Should we keep t * M?

no_TM = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time +
              metaecosystem_type +
              disturbance +
              predicted_from_time : disturbance +
              metaecosystem_type : disturbance +
              (predicted_from_time | system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))
## Warning: Some predictor variables are on very different scales: consider
## rescaling
anova(no_TMD, no_TM)
## Data: ds_regional_predicted_shrunk_type_n_day
## Models:
## no_TM: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:disturbance + metaecosystem_type:disturbance + (predicted_from_time | system_nr)
## no_TMD: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:metaecosystem_type + predicted_from_time:disturbance + metaecosystem_type:disturbance + (predicted_from_time | system_nr)
##        npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)
## no_TM    10 -165.06 -137.18 92.528  -185.06                     
## no_TMD   11 -163.07 -132.41 92.538  -185.07 0.0194  1     0.8893
#Keep no_TM

No.

Should we keep t * D?

no_TD = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time +
              metaecosystem_type +
              disturbance +
              metaecosystem_type : disturbance +
              (predicted_from_time | system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))

anova(no_TM, no_TD)
## Data: ds_regional_predicted_shrunk_type_n_day
## Models:
## no_TD: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + metaecosystem_type:disturbance + (predicted_from_time | system_nr)
## no_TM: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:disturbance + metaecosystem_type:disturbance + (predicted_from_time | system_nr)
##       npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)   
## no_TD    9 -156.84 -131.75 87.420  -174.84                        
## no_TM   10 -165.06 -137.18 92.528  -185.06 10.216  1   0.001392 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Keep no_TM

Yes.

Should we keep M * D?

no_MD = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time +
              metaecosystem_type +
              disturbance +
              predicted_from_time : disturbance +
              (predicted_from_time | system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))
## Warning: Some predictor variables are on very different scales: consider
## rescaling
anova(no_TM, no_MD)
## Data: ds_regional_predicted_shrunk_type_n_day
## Models:
## no_MD: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:disturbance + (predicted_from_time | system_nr)
## no_TM: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:disturbance + metaecosystem_type:disturbance + (predicted_from_time | system_nr)
##       npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)
## no_MD    9 -166.26 -141.17 92.129  -184.26                     
## no_TM   10 -165.06 -137.18 92.528  -185.06 0.7978  1     0.3717
#Keep no_MD

No.

Should we keep the random effect of system nr on the time slopes (day | system_nr)?

no_random_slopes = lmer(log10(regional_mean_bioarea + 1) ~
              predicted_from_time +
              metaecosystem_type +
              disturbance +
              predicted_from_time : disturbance +
              (1 | system_nr),
            data = ds_regional_predicted_shrunk_type_n_day,
            REML = FALSE,
            control = lmerControl(optimizer = "Nelder_Mead"))
## Warning: Some predictor variables are on very different scales: consider
## rescaling
anova(no_MD, no_random_slopes)
## Data: ds_regional_predicted_shrunk_type_n_day
## Models:
## no_random_slopes: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:disturbance + (1 | system_nr)
## no_MD: log10(regional_mean_bioarea + 1) ~ predicted_from_time + metaecosystem_type + disturbance + predicted_from_time:disturbance + (predicted_from_time | system_nr)
##                  npar     AIC     BIC logLik deviance  Chisq Df Pr(>Chisq)  
## no_random_slopes    7 -165.49 -145.98 89.746  -179.49                       
## no_MD               9 -166.26 -141.17 92.129  -184.26 4.7662  2    0.09226 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Keep no_MD

Yes.

Best model

Therefore, our best model is:

\[ log_{10}(regional \: bioarea + 1) = t + M + D + t*D + (t| system \: nr) \]

This model is the best model when looking at all time points coming after the first disturbance event (t2->t7). Assuming that this model holds for also other sections of the time series, the r squared of the model and of meta-ecosystem type is as follows.

#Create a table in which time is a fixed effect. 

### --- INITIALISE TABLE --- ###
columns = c("model", "time_point", "AIC", "BIC", "R2_mixed", "R2_fixed", "R2_mixed_M", "R2_fixed_M")
fitted_time_table = data.frame(matrix(ncol = length(columns), nrow = 0))
colnames(fitted_time_table) = columns

### --- POPULATE TABLE --- ###

for (last_point in 4:7) {
  
  full_model = lmer(log10(regional_mean_bioarea + 1) ~
                     predicted_from_time +
                     metaecosystem_type +
                     disturbance +
                     predicted_from_time : disturbance +
                     (predicted_from_time | system_nr),
                     data = ds_regional_predicted_shrunk_type_n_day %>%
                            filter(time_point <= last_point),
                    REML = FALSE,
                    control = lmerControl(optimizer = "Nelder_Mead"))

  null_model = lm(regional_mean_bioarea ~ 
                    1 , 
                  data = ds_regional_predicted_shrunk_type_n_day %>%
                            filter(time_point <= last_point))
  
  metaeco_null_model = lmer(log10(regional_mean_bioarea + 1) ~
                              predicted_from_time +
                              disturbance +
                              predicted_from_time : disturbance +
                              (predicted_from_time | system_nr),
                            data = ds_regional_predicted_shrunk_type_n_day %>%
                              filter(time_point <= last_point),
                            REML = FALSE,
                            control = lmerControl(optimizer = "Nelder_Mead"))
  
  fitted_time_table = update_all_models_table("Tp + M + D + Tp * D + (Tp | system_nr)",
                                             fitted_time_table, 
                                             full_model, 
                                             null_model,
                                             metaeco_null_model,
                                             "mixed")
}

datatable(fitted_time_table, 
          rownames = FALSE,
          options = list(pageLength = 100,
                         scrollX = TRUE,
                         autoWidth = TRUE,
                         columnDefs = list(list(targets=c(0),visible=TRUE, width='160'),
                                           list(targets=c(1), visible=TRUE, width='10'),
                                           list(targets=c(2), visible=TRUE, width='10'),
                                           list(targets=c(3), visible=TRUE, width='10'),
                                           list(targets=c(4), visible=TRUE, width='10'),
                                           list(targets=c(5), visible=TRUE, width='10'),
                                           list(targets=c(6), visible=TRUE, width='10'),
                                           list(targets=c(7), visible=TRUE, width='10'),
                                           list(targets='_all', visible=FALSE))),
          caption = "
          Tp = predicted from time,
          M = Meta-ecosystem type, 
          D = disturbance, 
          (1 | t) = random effect of time on the intercept,
          (1 | ID) = random effect of meta-ecosystem ID on the intercept, 
          || = no correlation between intercept and slope,
          | = correlation between intercept and slope,
          R2 = r squared of the whole model,
          R2_fixed = fixed part of the mixed model,
          mixed_R2 = r squared when considering both fixed and random effects (conditional r squared), 
          fixed_R2 = r squared when considering only the fixed effects (marginal r squared)")
Models table
#Table with all the different models
full_table = rbind(random_time_table, fitted_time_table, log_time_table)

datatable(full_table, 
          rownames = FALSE,
          options = list(pageLength = 100,
                         scrollX = TRUE,
                         autoWidth = TRUE,
                         columnDefs = list(list(targets=c(0),visible=TRUE, width='160'),
                                           list(targets=c(1), visible=TRUE, width='10'),
                                           list(targets=c(2), visible=TRUE, width='10'),
                                           list(targets=c(3), visible=TRUE, width='10'),
                                           list(targets=c(4), visible=TRUE, width='10'),
                                           list(targets=c(5), visible=TRUE, width='10'),
                                           list(targets=c(6), visible=TRUE, width='10'),
                                           list(targets=c(7), visible=TRUE, width='10'),
                                           list(targets='_all', visible=FALSE))),
          caption = "
          M = Meta-ecosystem type, 
          D = disturbance, 
          (1 | t) = random effect of time on the intercept,
          (1 | ID) = random effect of meta-ecosystem ID on the intercept, 
          || = no correlation between intercept and slope,
          | = correlation between intercept and slope,
          R2 = r squared of the whole model,
          R2_fixed = fixed part of the mixed model,
          mixed_R2 = r squared when considering both fixed and random effects (conditional r squared), 
          fixed_R2 = r squared when considering only the fixed effects (marginal r squared)")
Model single points

Model selection

I’ll do model selection only on time point number 3 (however, I have done it also with time point 4,5,6,7 and they all give me the same result). Let’s start from the full model.

I can’t construct it from a mixed model because the following error pops up:

  • Error: number of levels of each grouping factor must be < number of observations (problems: system_nr)
full = lm(regional_mean_bioarea ~
              metaecosystem_type +
              disturbance +
              disturbance : metaecosystem_type,
            data = ds_regional %>%
              filter(time_point == 3) %>%
              filter(metaecosystem_type == "M_M" |
                     metaecosystem_type == "S_L"))

Should we keep D * M?

no_MD = lm(regional_mean_bioarea ~
              metaecosystem_type +
              disturbance,
            data = ds_regional %>%
              filter(time_point == 3) %>%
              filter(metaecosystem_type == "M_M" |
                     metaecosystem_type == "S_L"))

anova(full, no_MD)
## Analysis of Variance Table
## 
## Model 1: regional_mean_bioarea ~ metaecosystem_type + disturbance + disturbance:metaecosystem_type
## Model 2: regional_mean_bioarea ~ metaecosystem_type + disturbance
##   Res.Df     RSS Df Sum of Sq      F Pr(>F)
## 1     16 3433038                           
## 2     17 3732239 -1   -299201 1.3945 0.2549
AIC(full, no_MD)
##       df      AIC
## full   5 307.8220
## no_MD  4 307.4933

No.

Best model

Therefore, our best model is

\[ Regional \: Bioarea = M + D \]

columns = c("time_point", "R2", "R2_M")
single_points = matrix(ncol = length(columns), 
                       nrow = 7)
single_points = as.data.frame(single_points)
colnames(single_points) = columns

for (t in 1:7) {
  
  full_model = lm(regional_mean_bioarea ~ 
               disturbance +
               metaecosystem_type,
             data = ds_regional %>%
               filter(metaecosystem_type == "M_M" |
                      metaecosystem_type == "S_L") %>%
               filter(time_point == t))

  no_M_model = lm(regional_mean_bioarea ~ 
              disturbance,
            data = ds_regional %>%
              filter(metaecosystem_type == "M_M" |
                      metaecosystem_type == "S_L") %>%
              filter(time_point == t))
  
  R2_full_model = summary(full_model)$adj.r.squared
  R2_no_M_model = summary(no_M_model)$adj.r.squared
  R2_M = R2_full_model - R2_no_M_model
  
  single_points$time_point[t] = t
  single_points$R2[t] = R2_full_model
  single_points$R2_M[t] = R2_M
  
}

single_points = round(single_points, digits = 2)
single_points = single_points[2:nrow(single_points),]
datatable(single_points,
          rownames = FALSE,
          colnames = c("Time point", "R2 model", "R2 meta-ecosystem type"))

Meta-ecosystems of different total size

ds_regional %>%
  filter(!metaecosystem_type == "S_L") %>%
  filter ( disturbance == "low") %>%
  ggplot (aes(x = day,
                y = regional_mean_bioarea,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (something/µl)",
         title = "Disturbance = low",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
  scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large-large",
                                     "medium-medium",
                                     "small-small"))  +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional %>%
  filter(!metaecosystem_type == "S_L") %>%
  filter ( disturbance == "high") %>%
  ggplot (aes(x = day,
                y = regional_mean_bioarea,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = metaecosystem_type)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (something/µl)",
         title = "Disturbance = high",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
    scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large-large",
                                     "medium-medium",
                                     "small-small"))  +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional %>%
  filter(disturbance == "low") %>%
  filter(!metaecosystem_type == "S_L") %>%
  ggplot(aes(x = day,
             y = regional_mean_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() + 
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Local bioarea (something/μl)",
       fill = "") + 
  #scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large-large",
                                 "medium-medium",
                                 "small-small")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_regional %>%
  filter(disturbance == "high") %>%
  filter(!metaecosystem_type == "S_L") %>%
  ggplot(aes(x = day,
             y = regional_mean_bioarea,
             group = interaction(day, metaecosystem_type),
             fill = metaecosystem_type)) +
  geom_boxplot() + 
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Local bioarea (something/μl)",
       fill = "") + 
  #scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large-large",
                                 "medium-medium",
                                 "small-small")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Interesting. It seems like there’s not much difference between the medium-medium and the large-large.

Local biomass

Small patches

### --- SINGLE ECOSYSTEMS FOR ALL SMALL PATCHES --- ###

ds_biomass %>%
  filter(disturbance == "low") %>%
  #filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
  filter(patch_size == "S") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = low",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("small isolated",
                                     "small connected to small",
                                     "small connected to large"))  +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

ds_biomass %>%
  filter(disturbance == "high") %>%
  #filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
  filter(patch_size == "S") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Local bioarea (something/μl)",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("small isolated",
                                     "small connected to small",
                                     "small connected to large"))  +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

This is quite messy, as the three types overlap quite a bit. I will then just show the difference between a small patch connected to another small and the one connected to a large one.

### --- SINGLE ECOSYSTEMS FOR SMALL PATCHES CONNTED TO S OR L --- ###

ds_biomass %>%
  filter(disturbance == "low") %>%
  filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Local bioarea (something/μl)",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("small connected to small",
                                     "small connected to large"))

ds_biomass %>%
  filter(disturbance == "high") %>%
  filter(eco_metaeco_type == "S (S_S)" | eco_metaeco_type == "S (S_L)") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = high",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("small connected to small",
                                     "small connected to large")) +
  geom_vline(xintercept = first_perturbation_day, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

### --- BOXPLOTS FOR SMALL PATCHES CONNTED TO S OR L --- ###

local_small_low_plot = ds_biomass %>%
  filter(disturbance == "low") %>%
  filter(patch_size == "S") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day,eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = low",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("small isolated", 
                                 "small connected to small",
                                 "small connected to large")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")
local_small_low_plot

ds_biomass %>%
  filter(disturbance == "high") %>%
  filter(patch_size == "S") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day,eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Local bioarea (something/μl)",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("small isolated", 
                                 "small connected to small",
                                 "small connected to large")) +
  geom_vline(xintercept = first_perturbation_day + 0.7, 
             linetype="dotdash", 
             color = "grey", 
             size=0.7) +
  labs(caption = "Vertical grey line: first perturbation")

Large patches

### --- SINGLE PATCHES --- ###

ds_biomass %>%
  filter(disturbance == "low") %>%
  filter(patch_size == "L") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = low",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "large connected to large",
                                     "large connected to small"))

ds_biomass %>%
  filter(disturbance == "high") %>%
  filter(patch_size == "L") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = system_nr,
             fill = system_nr,
             color = system_nr,
             linetype = eco_metaeco_type)) +
  geom_line(stat = "summary", fun = "mean") + 
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = high",
       linetype = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "large connected to large",
                                     "large connected to small"))

### --- BOXPLOTS --- ###

ds_biomass %>%
  filter(disturbance == "low") %>%
  filter(patch_size == "L") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day,eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = low",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large isolated", 
                                 "large connected to large",
                                 "large connected to small"))

local_large_low_plot = ds_biomass %>%
  filter(disturbance == "high") %>%
  filter(patch_size == "L") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day,eco_metaeco_type),
             fill = eco_metaeco_type)) +
  geom_boxplot() +
  labs(x = "Day",
       y = "Local bioarea (something/μl)",
       title = "Disturbance = high",
       fill = "") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank(),
        legend.position = c(.95, .95),
        legend.justification = c("right", "top"),
        legend.box.just = "right",
        legend.margin = margin(6, 6, 6, 6)) +
  scale_fill_discrete(labels = c("large isolated", 
                                 "large connected to large",
                                 "large connected to small"))

L vs L (small-large)

ds_local_S_t2_t7 = ds_biomass %>%
  filter (eco_metaeco_type == "L" |
          eco_metaeco_type == "L (S_L)") %>%
  filter(time_point >= 2) #Let's take off the first two time points which are before the first disturbance event.

full_model = lmer(bioarea_per_volume ~
                    metaecosystem_type * disturbance  +
                    (1 | day),
                  data = ds_local_S_t2_t7,
                  REML = FALSE)

no_metaeco_type_model = lmer(bioarea_per_volume ~
                    disturbance +
                    (1 | day),
                  data = ds_local_S_t2_t7,
                  REML = FALSE)

anova(full_model, no_metaeco_type_model)
## Data: ds_local_S_t2_t7
## Models:
## no_metaeco_type_model: bioarea_per_volume ~ disturbance + (1 | day)
## full_model: bioarea_per_volume ~ metaecosystem_type * disturbance + (1 | day)
##                       npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## no_metaeco_type_model    4 2588.3 2600.6 -1290.1   2580.3                     
## full_model               6 2564.9 2583.3 -1276.4   2552.9 27.422  2   1.11e-06
##                          
## no_metaeco_type_model    
## full_model            ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
r.squaredGLMM(full_model)
##           R2m       R2c
## [1,] 0.112824 0.5984664
r.squaredGLMM(no_metaeco_type_model)
##             R2m       R2c
## [1,] 0.03703173 0.5201834
summary(full_model)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: bioarea_per_volume ~ metaecosystem_type * disturbance + (1 |      day)
##    Data: ds_local_S_t2_t7
## 
##      AIC      BIC   logLik deviance df.resid 
##   2564.9   2583.3  -1276.4   2552.9      154 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.20824 -0.71261 -0.04693  0.57508  2.62955 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  day      (Intercept) 528753   727.2   
##  Residual             437178   661.2   
## Number of obs: 160, groups:  day, 6
## 
## Fixed effects:
##                                      Estimate Std. Error t value
## (Intercept)                           2341.36     315.20   7.428
## metaecosystem_typeS_L                 -563.64     147.85  -3.812
## disturbancelow                         409.35     147.85   2.769
## metaecosystem_typeS_L:disturbancelow   -18.07     209.09  -0.086
## 
## Correlation of Fixed Effects:
##             (Intr) mt_S_L dstrbn
## mtcsyst_S_L -0.235              
## disturbnclw -0.235  0.500       
## mtcsys_S_L:  0.166 -0.707 -0.707

Isolated patches

ds_biomass %>%
  filter ( disturbance == "low") %>%
  filter(metaecosystem == "no") %>%
  group_by (system_nr, day, patch_size) %>%
  summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
  ggplot (aes(x = day,
                y = mean_bioarea_per_volume_across_videos,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = patch_size)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (something/µl)",
         title = "Disturbance = low",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
  scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "medium isolated",
                                     "small isolated"))

ds_biomass %>%
  filter ( disturbance == "high") %>%
  filter(metaecosystem == "no") %>%
  group_by (system_nr, day, patch_size) %>%
  summarise(mean_bioarea_per_volume_across_videos = mean(bioarea_per_volume)) %>%
  ggplot (aes(x = day,
                y = mean_bioarea_per_volume_across_videos,
                group = system_nr,
                fill = system_nr,
              color = system_nr,
                linetype = patch_size)) +
    geom_line () +
    labs(x = "Day", 
         y = "Regional bioarea (something/µl)",
         title = "Disturbance = low",
         fill = "System nr",
         linetype = "") +
    scale_y_continuous(limits = c(0, 6250)) +
    scale_x_continuous(limits = c(-2, 30)) +
  scale_colour_continuous(guide = "none") +
    theme_bw() +
    theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)) +
  scale_linetype_discrete(labels = c("large isolated",
                                     "medium isolated",
                                     "small isolated"))

ds_biomass %>%
  filter(disturbance == "low") %>%
  filter(metaecosystem == "no") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day, patch_size),
             fill = patch_size)) +
  geom_boxplot() + 
  labs(title = "Disturbance = low",
       x = "Day",
       y = "Local bioarea (something/μl)",
       fill = "") + 
  scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6))

ds_biomass %>%
  filter(disturbance == "high") %>%
  filter(metaecosystem == "no") %>%
  ggplot(aes(x = day,
             y = bioarea_per_volume,
             group = interaction(day, patch_size),
             fill = patch_size)) +
  geom_boxplot() + 
  labs(title = "Disturbance = high",
       x = "Day",
       y = "Local bioarea (something/μl)",
       fill = "") + 
  scale_fill_discrete(labels = c("isolated large", "isolated medium", "isolated small")) +
  theme_bw() + 
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6))

Evaporation

We want to know if there was a systematic bias in the evaporation of different treatments (disturbance, patch size) and whether evaporation changed across time. My expectation would be that we would see a difference among the exchanges 2,3 and the exchanges 4,5,6. This is because in exchange 2,3 cultures were microwaved in 15 tubes for 3 minutes and in exchange 4,5,6 cultures were microwaved in 4 tubes for only 1 minute.

Tidy

#Columns: exchange & evaporation
ds_for_evaporation = gather(ds_for_evaporation, 
                            key = exchange, 
                            value = evaporation, 
                            water_add_after_t2:water_add_after_t6)
ds_for_evaporation[ds_for_evaporation == "water_add_after_t2"] = "2"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t3"] = "3"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t4"] = "4"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t5"] = "5"
ds_for_evaporation[ds_for_evaporation == "water_add_after_t6"] = "6"
ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] = ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] / 2 #This is because exchange contained the topping up of two exchanges
ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] = ds_for_evaporation$evaporation[ds_for_evaporation$exchange == 2] + 2 #We need to add 2 ml to the evaporation that happened at the exchange events 1 and 2. This is because we already added 1 ml of water at exchange 1 and 1 ml of water at exchange 2. 

#Column: nr_of_tubes_in_rack
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 1] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 2] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 3] = 15
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 4] = 4
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 5] = 4
ds_for_evaporation$nr_of_tubes_in_rack[ds_for_evaporation$exchange == 6] = 4

Plot

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = as.character(nr_of_tubes_in_rack),
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Number of tubes in rack", 
       y = "Evaporation (ml)")

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = as.character(patch_size),
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Patch size", 
       y = "Evaporation (ml)")

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = as.character(day),
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Day", 
       y = "Evaporation (ml)")

ds_for_evaporation %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = disturbance,
             y = evaporation)) + 
  geom_boxplot() + 
  labs(x = "Disturbance", 
       y = "Evaporation (ml)")

It seems like there is no real difference across time, disturbance, or patch type. However, we could also run a mixed effect model to show that they do not.

Mixed effect model

I’m not going to run it because it takes a long time.

mixed.model = lmer(evaporation  ~ 
                     patch_size * disturbance  * exchange + 
                     (exchange | culture_ID), 
                   data = ds_for_evaporation,
                   REML = FALSE, 
                   control = lmerControl (optimizer = "Nelder_Mead"))

null.model = lm(evaporation ~
                  1, 
                data = ds_for_evaporation)

anova(mixed.model, null.model)

Body size

Aim

Data manipulation

Import

culture_info = read.csv(here("data", "PatchSizePilot_culture_info.csv"), header = TRUE)
load(here("data", "morphology", "t0.RData"));t0 = morph_mvt
load(here("data", "morphology", "t1.RData"));t1 = morph_mvt
load(here("data", "morphology", "t2.RData"));t2 = morph_mvt
load(here("data", "morphology", "t3.RData"));t3 = morph_mvt
load(here("data", "morphology", "t4.RData"));t4 = morph_mvt
load(here("data", "morphology", "t5.RData"));t5 = morph_mvt
load(here("data", "morphology", "t6.RData"));t6 = morph_mvt
load(here("data", "morphology", "t7.RData"));t7 = morph_mvt
rm(morph_mvt)

Tidy

### --- Tidy t0 - t7 data-sets --- ###

#Column: time
t0$time = NA
t1$time = NA

#Column: replicate_video
t0$replicate_video[t0$file == "sample_00001"] = 1
t0$replicate_video[t0$file == "sample_00002"] = 2
t0$replicate_video[t0$file == "sample_00003"] = 3
t0$replicate_video[t0$file == "sample_00004"] = 4
t0$replicate_video[t0$file == "sample_00005"] = 5
t0$replicate_video[t0$file == "sample_00006"] = 6
t0$replicate_video[t0$file == "sample_00007"] = 7
t0$replicate_video[t0$file == "sample_00008"] = 8
t0$replicate_video[t0$file == "sample_00009"] = 9
t0$replicate_video[t0$file == "sample_00010"] = 10
t0$replicate_video[t0$file == "sample_00011"] = 11
t0$replicate_video[t0$file == "sample_00012"] = 12
t1$replicate_video = 1 #In t1 I took only 1 video/culture
t2$replicate_video = 1 #In t2 I took only 1 video/culture
t3$replicate_video = 1 #In t3 I took only 1 video/culture
t4$replicate_video = 1 #In t4 I took only 1 video/culture
t5$replicate_video = 1 #In t5 I took only 1 video/culture
t6 = t6 %>% rename(replicate_video = replicate)
t7 = t7 %>% rename(replicate_video = replicate)


### --- Create ds_body_size dataset --- ###

long_t0 = t0 %>% slice(rep(1:n(), max(culture_info$culture_ID)))
ID_vector = NULL
ID_vector_elongating = NULL
for (ID in 1:max(culture_info$culture_ID)){
  ID_vector = rep(ID, times = nrow(t0))
  ID_vector_elongating = c(ID_vector_elongating, ID_vector)
}
long_t0$culture_ID = ID_vector_elongating
t0 = merge(culture_info,long_t0, by="culture_ID"); rm(long_t0)
t1 = merge(culture_info,t1,by="culture_ID")
t2 = merge(culture_info,t2,by="culture_ID")
t3 = merge(culture_info,t3,by="culture_ID")
t4 = merge(culture_info,t4,by="culture_ID")
t5 = merge(culture_info,t5,by="culture_ID")
t6 = merge(culture_info,t6,by="culture_ID")
t7 = merge(culture_info,t7,by="culture_ID")
ds_body_size = rbind(t0, t1, t2, t3, t4, t5, t6, t7); rm(t0, t1, t2, t3, t4, t5, t6, t7)

### --- Tidy ds_body_size data-set --- ###

#Column: day
ds_body_size$day = ds_body_size$time_point;
ds_body_size$day[ds_body_size$day=="t0"] = "0"
ds_body_size$day[ds_body_size$day=="t1"] = "4"
ds_body_size$day[ds_body_size$day=="t2"] = "8"
ds_body_size$day[ds_body_size$day=="t3"] = "12"
ds_body_size$day[ds_body_size$day=="t4"] = "16"
ds_body_size$day[ds_body_size$day=="t5"] = "20"
ds_body_size$day[ds_body_size$day=="t6"] = "24"
ds_body_size$day[ds_body_size$day=="t7"] = "28"
ds_body_size$day = as.numeric(ds_body_size$day)

#Column: time point
ds_body_size$time_point[ds_body_size$time_point=="t0"] = 0
ds_body_size$time_point[ds_body_size$time_point=="t1"] = 1
ds_body_size$time_point[ds_body_size$time_point=="t2"] = 2
ds_body_size$time_point[ds_body_size$time_point=="t3"] = 3
ds_body_size$time_point[ds_body_size$time_point=="t4"] = 4
ds_body_size$time_point[ds_body_size$time_point=="t5"] = 5
ds_body_size$time_point[ds_body_size$time_point=="t6"] = 6
ds_body_size$time_point[ds_body_size$time_point=="t7"] = 7
ds_body_size$time_point = as.character(ds_body_size$time_point)

#Column: eco_metaeco_type
ds_body_size$eco_metaeco_type = factor(ds_body_size$eco_metaeco_type, 
                             levels=c('S', 'S (S_S)', 'S (S_L)', 'M', 'M (M_M)', 'L', 'L (L_L)', 'L (S_L)'))

#Select useful columns
ds_body_size = ds_body_size %>% 
  select(culture_ID, 
         patch_size, 
         disturbance, 
         metaecosystem_type, 
         mean_area, 
         replicate_video, 
         day, 
         metaecosystem, 
         system_nr, 
         eco_metaeco_type)

#Reorder columns
ds_body_size = ds_body_size[, c("culture_ID", 
            "system_nr", 
            "disturbance", 
            "day",
            "patch_size", 
            "metaecosystem", 
            "metaecosystem_type", 
            "eco_metaeco_type", 
            "replicate_video",
            "mean_area")]

Create size classes data-set

As in Jacquet, Gounand, and Altermatt (2020) I will create 12 size classes.

#### --- PARAMETERS & INITIALISATION --- ###

nr_of_size_classes = 12
largest_size = max(ds_body_size$mean_area)
size_class_width = largest_size/nr_of_size_classes
size_class = NULL

### --- CREATE DATASET --- ###

size_class_boundaries = seq(0, largest_size, by = size_class_width)

for (class in 1:nr_of_size_classes){
  
  bin_lower_limit = size_class_boundaries[class]
  bin_upper_limit = size_class_boundaries[class+1]
  size_input = (size_class_boundaries[class] + size_class_boundaries[class + 1])/2
  
  size_class[[class]] = ds_body_size%>%
    filter(bin_lower_limit <= mean_area) %>%
    filter(mean_area <= bin_upper_limit) %>%
    group_by(culture_ID, 
             system_nr, 
             disturbance, 
             day, 
             patch_size, 
             metaecosystem, 
             metaecosystem_type, 
             eco_metaeco_type, 
             replicate_video) %>% #Group by video
    summarise(mean_abundance_across_videos = n()) %>%
    group_by(culture_ID, 
             system_nr, 
             disturbance, 
             day, 
             patch_size, 
             metaecosystem, 
             metaecosystem_type, 
             eco_metaeco_type) %>% #Group by ID
    summarise(abundance = mean(mean_abundance_across_videos)) %>%
    mutate(log_abundance = log(abundance)) %>%
    mutate(size_class = class) %>%
    mutate(size = size_input) %>%
    mutate(log_size = log(size))
  
}

ds_classes = rbind(size_class[[1]], size_class[[2]], size_class[[3]], size_class[[4]],
                  size_class[[5]], size_class[[6]], size_class[[7]], size_class[[8]],
                  size_class[[9]], size_class[[10]], size_class[[11]], size_class[[12]],)

Plot

Comparison of the three small patches

The code for plotting the size class distribution across small patches doesn’t work for some reason. It displays all plots as if they were the same.

for (time_point in 0:7) {
   
  print(ds_classes %>%
     filter(time_point == time_point) %>%
     filter(disturbance == "low") %>%
     filter(patch_size == "S") %>%
     ggplot(aes(x = log_size,
                y = log_abundance,
                group = interaction(log_size, eco_metaeco_type),
                fill = eco_metaeco_type)) +
     geom_boxplot() +
     labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = paste0("Disturbance = Low, Day = ", time_point*4),           x = "log body size (µm2)", 
          y = "log mean abundance + 1 (indiv/µl)",
          fill='') +
     scale_y_continuous(limits = c(0,5))+
     scale_x_continuous(limits = c(7,10.5)) +
     scale_fill_discrete(labels = c("small isolated",
                                    "small connected to small",
                                    "small connected to large")) +
     theme_bw() +
     theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))
}

for (time_point in 0:7) {
   
  print(ds_classes %>%
     filter(time_point == time_point) %>%
     filter(disturbance == "high") %>%
     filter(patch_size == "S") %>%
     ggplot(aes(x = log_size,
                y = log_abundance,
                group = interaction(log_size, eco_metaeco_type),
                fill = eco_metaeco_type)) +
     geom_boxplot() +
     labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = paste0("Disturbance = High, Day = ", time_point*4), 
          x = "log body size (µm2)", 
          y = "log mean abundance + 1 (indiv/µl)",
          fill='') +
     scale_y_continuous(limits = c(0,5))+
     scale_x_continuous(limits = c(7,10.5)) +
     scale_fill_discrete(labels = c("small isolated",
                                    "small connected to small",
                                    "small connected to large")) +
     theme_bw() +
     theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))
}

Comparison of different time points

for (disturbance in c("low", "high")) {

  print(paste("Disturbance =", disturbance))
  
print(ds_classes %>%
  filter(eco_metaeco_type == "S") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "small isolated") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "S (S_S)") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "small connected to small") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "S (S_L)") %>%
  filter(disturbance == "low") %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "small connected to large") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "M") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "medium isolated") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "M (M_M)") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "medium connected to medium") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "L") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "large isolated") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "L (L_L)") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +
  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "large connected to large") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

print(ds_classes %>%
  filter(eco_metaeco_type == "L (S_L)") %>%
  filter(disturbance == disturbance) %>%
  ggplot(aes(x = log_size,
             y = log_abundance,
             group = interaction(log_size, day),
             color = day)) +  labs(x = "Log size (μm2)", y = "Log abundance + 1 (indiv/μm2)", title = "large connected to small") +
  geom_point(stat = "summary", fun = "mean") +
  geom_line(stat = "summary", fun = "mean", aes(group = day)) +
  scale_fill_gradient(low="blue", high="yellow") + 
  theme_bw() +
  theme(panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          legend.position = c(.95, .95),
          legend.justification = c("right", "top"),
          legend.box.just = "right",
          legend.margin = margin(6, 6, 6, 6)))

}
## [1] "Disturbance = low"

## [1] "Disturbance = high"

Figures

Regional bioarea density (mean bioarea density between two patches) in meta-ecosystems of the same total area, but whose two patches have either the same size (red, medium-medium meta-ecosystem) or that have a smaller and larger patch (blue, small-large meta-ecosystem). Points represent the mean, error bars represent the standard deviation. For clarity, only the low disturbance treatment is shown here. See the Appendix for equivalent the figure of the high disturbance treatment.

Regional bioarea density (mean bioarea density between two patches) in meta-ecosystems of the same total area, but whose two patches have either the same size (red, medium-medium meta-ecosystem) or that have a smaller and larger patch (blue, small-large meta-ecosystem). Points represent the mean, error bars represent the standard deviation. For clarity, only the low disturbance treatment is shown here. See the Appendix for equivalent the figure of the high disturbance treatment.

Local biomass production in patches that are either connnected to a patch of the same size (green, small patches of small-small meta-ecosystemes) or to a patch of larger size (orange, small patches of small-large meta-ecosystems). Points represent the mean, error bars represent the standard deviation. For clarity, only the low disturbance treatment is shown here. See the Appendix for equivalent the figure of the high disturbance treatment. Problem: the small patches in green actually have different sample size than the one in orange. Does it still make sense to include the standard deviation?

Local biomass production in patches that are either connnected to a patch of the same size (green, small patches of small-small meta-ecosystemes) or to a patch of larger size (orange, small patches of small-large meta-ecosystems). Points represent the mean, error bars represent the standard deviation. For clarity, only the low disturbance treatment is shown here. See the Appendix for equivalent the figure of the high disturbance treatment. Problem: the small patches in green actually have different sample size than the one in orange. Does it still make sense to include the standard deviation?

Tests

Evaporation when microwaving 15 falcon tubes at the time

evaporation.test = read.csv(here("data", "evaporation_test","evaporation_test_right.csv"), header = TRUE)

evaporation.test %>%
  ggplot(aes (x = as.character(water_pipetted),
                y = weight_water_evaporated,
                group = interaction(water_pipetted, as.character(rack)),
                fill = as.character(rack))) +
  geom_boxplot() +
  labs(x = "Water volume (ml)" , 
       y = "Evaporation (g)", 
       fill = "Rack replicate")

Evaporation when microwaving 5 tubes with 10 filled or empty tubes

evaporation.test = read.csv(here("data", "evaporation_test", "evaporation_test_fill_nofill.csv"), header = TRUE)

evaporation.test %>%
  ggplot(aes (x = all_tubes_water,
              y = weight_water_evaporated)) +
  geom_boxplot() +
  labs(x = "Water in the other 10 tubes" , 
  y = "Evaporation (g)", 
  caption = "When all tubes were filled, they were filled with 6.75 ml of deionised water (I think, but I need to check in my lab book.")

Running time

## Time difference of 37.98438 secs

Other

Mixed effects models

  • To build the mixed effect models we will use the R package lme4. See page 6 of this PDF to know more about the syntaxis of this package.

  • To do model diagnostics of mixed effect models, I’m going to look at the following two plots (as suggested by Zuur et al. (2009), page 487):

    • Quantile-quantile plots

    • Partial residual plots

  • The effect size of the explaining variables is calculated in the mixed effect models as marginal and conditional r squared. The marginal r squared is how much variance is explained by the fixed effects. The conditional r squared is how much variance is explained by the fixed and the random effects. The marginal and conditional r squared are calculated using the package MuMIn. The computation is based on the methods of Nakagawa, Johnson, and Schielzeth (2017). For the coding and interpretation of these r squared check the documentation for the r.squaredGLMM function

  • See for the interaction syntaxis this link.

Model selection

  • I am starting from the largest model and then simplifying because … (see statistical modelling course at ETH).

  • I am going to select the best model according to AIC. BIC is better for understanding and AIC for predicting. Halsey (2019) also suggests this approach instead of p values. I’m going to use AIC because I’m interested in knowing how much meta-ecosystem type contributed to the overall regional biomass.

Bibliography

Halsey, Lewis G. 2019. The reign of the p-value is over: What alternative analyses could we employ to fill the power vacuum? Biology Letters 15 (5). https://doi.org/10.1098/rsbl.2019.0174.
Jacquet, Claire, Isabelle Gounand, and Florian Altermatt. 2020. How pulse disturbances shape size-abundance pyramids.” Ecology Letters 23 (6): 1014–23. https://doi.org/10.1111/ele.13508.
Nakagawa, Shinichi, Paul C. D. Johnson, and Holger Schielzeth. 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded.” Journal of the Royal Society Interface 14 (134). https://doi.org/10.1098/rsif.2017.0213.
Zuur, Alain F., Elena N. Ieno, Neil Walker, Anatoly A. Saveliev, and Graham M. Smith. 2009. Mixed effects models and extensions in ecology with R. Vol. 36. Statistics for Biology and Health. New York, NY: Springer New York. https://doi.org/10.1007/978-0-387-87458-6.